Crash-Quiescent Failure Detection

Authors

  • Srikanth Sastry
  • Scott M. Pike
  • Jennifer L. Welch
Abstract

A distributed algorithm is crash quiescent if it eventually stops sending messages to crashed processes. An algorithm can be made crash quiescent by providing it with either a crash notification service or a reliable communication service. Both services can be implemented in practical environments with failure detectors. Therefore, crash-quiescent failure detection is fundamental to system-wide crash quiescence. We establish necessary and sufficient conditions for crash-quiescent failure detection in partially synchronous environments where a bounded, but unknown, number of consecutive messages can be arbitrarily late or lost. Without a correct majority of processes, not even the weakest oracle for fault-tolerant consensus, ◇W, can be implemented crash quiescently. With a correct majority, however, the eventually perfect failure detector, ◇P, is possible. Our ◇P algorithm is correct in all runs, but improves performance via crash quiescence in any run with a correct majority. We also present a refinement of our ◇P algorithm to mitigate the overhead of achieving crash quiescence; the resulting bit complexity per utilized link is asymptotically better than or equal to that of non-crash-quiescent
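The definition of crash quiescence given in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm; it only checks a recorded message trace against the definition. The names `Send`, `last_send_to_crashed`, and `quiescent_on_prefix`, and the use of integer logical times, are assumptions made for illustration. Crash quiescence is a property of infinite runs, so a check over a finite prefix can only approximate it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Send:
    """One message-send event in a recorded trace (hypothetical format)."""
    time: int       # logical time at which the message was sent
    sender: str
    receiver: str


def last_send_to_crashed(trace, crash_times):
    """Return the time of the last send addressed to an already-crashed
    process, or None if no such send appears in the trace.

    crash_times maps each crashed process to its crash time; processes
    absent from the map are treated as correct.
    """
    late = [
        s.time
        for s in trace
        if s.receiver in crash_times and s.time > crash_times[s.receiver]
    ]
    return max(late) if late else None


def quiescent_on_prefix(trace, crash_times, horizon):
    """True if, within this finite prefix, no message is sent to a crashed
    process after time `horizon`."""
    last = last_send_to_crashed(trace, crash_times)
    return last is None or last <= horizon


# Process q crashes at time 3; p keeps sending to q until time 5.
trace = [Send(1, "p", "q"), Send(5, "p", "q"), Send(6, "p", "r")]
crash_times = {"q": 3}
```

With this trace, the prefix is quiescent for a horizon of 5 but not for a horizon of 4, because the send at time 5 still targets the crashed process q.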


Similar resources

Communication-efficient and crash-quiescent Omega with unknown membership

Keywords: distributed computing, fault tolerance, unreliable failure detectors. The failure detector class Omega (Ω) provides an eventual leader election functionality, i.e., eventually all correct processes permanently trust the same correct process. An algorithm is communication-efficient if the number of links that carry messages forever is bounded by n, being n the numbe...


Termination Detection in Systems Where Processes May Crash and Recover

An algorithm solving the termination detection problem observes a computation of a distributed system and announces “termination” if the computation has come to an end. This work addresses termination detection in systems where processes fail by crashing and may restart later on. A new definition of robust-restricted termination, suited to the crash-recovery model, is developed. A computation...


Influence of Elastic Support on the Energy Absorption in Front Crash Ductile Failure Criterion

Thin-walled structures like crash boxes may be used as energy absorption members in automotive chassis. Many studies have addressed the behavior of energy absorption members in frontal crashes. These studies have attempted to predict the energy absorption and maximum impact load in shell structures. The energy absorption and maximum impact load depend on many parameters including ...


Termination Detection in an Asynchronous Distributed System with Crash-Recovery Failures

We revisit the problem of detecting the termination of a distributed application in an asynchronous message-passing model with crash-recovery failures and failure detectors. We derive a suitable definition of termination detection in this model but show that this definition is impossible to implement unless you have a failure detector which can predict the future. We subsequently weaken the pro...


Efficient Reduction for Wait-Free Termination Detection in a Crash-Prone Distributed System

We investigate the problem of detecting termination of a distributed computation in systems where processes can fail by crashing. Specifically, when the communication topology is fully connected, we describe a way to transform any termination detection algorithm A that has been designed for a failure-free environment into a termination detection algorithm B that can tolerate process crashes. Ou...



Publication date: 2009